Skip to content

BUG: Fixed incorrect stream size check (#14125) #15195

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

jeffcarey
Copy link
Contributor

@jeffcarey jeffcarey commented Jan 23, 2017

  • closes BUG: read_csv() crashes with engine='c' #14125
  • tests added / passed: test_large_difference_in_columns in c_parser_only.py, Other tests passed.
  • passes git diff upstream/master | flake8 --diff
  • whatsnew entry: Added to 0.20 whatsnew

Issue summary: Previously, self->stream_cap was copied into a local variable called maxstreamsize each time tokenize_bytes ran, and then this was checked in the PUSH_CHAR macro. However, there is one other place in the file where function make_stream_space() is called (in end_line()), and when this happens self->stream_cap is increased but maxstreamsize is not updated, making the check incorrect. In rare circumstances (see original issue or test case) this could cause a crash. The resolution is to just check self->stream_cap directly.

@jreback
Copy link
Contributor

jreback commented Jan 23, 2017

can you add the test from the issue?

@jreback jreback added the IO CSV read_csv, to_csv label Jan 23, 2017
@jeffcarey
Copy link
Contributor Author

I can, but a few things to consider first:
The test case from the issue does not follow a simple pattern, and it contains a row that is ~1000 characters long. So putting it into the test file is possible but would look kind of ugly. The test I had added instead causes the same issue, is a little easier to understand since it follows a pattern, and fits nicely into the test suite (in terms of formatting).

So with that said, do you still want the original test? If so, in an external file or directly in the script?

@jreback
Copy link
Contributor

jreback commented Jan 23, 2017

@jeffcarey oh if it replicates that is fine then. thanks!

@jreback jreback added this to the 0.20.0 milestone Jan 23, 2017
@jreback
Copy link
Contributor

jreback commented Jan 23, 2017

can you add a whatsnew note? ping when you push (as this is already validated).

@codecov-io
Copy link

codecov-io commented Jan 24, 2017

Current coverage is 85.97% (diff: 100%)

No coverage report found for master at 9309eba.

Powered by Codecov. Last update 9309eba...38c0cce

@jeffcarey
Copy link
Contributor Author

@jreback whatsnew note has been added.

@jreback jreback closed this in 64d7670 Jan 24, 2017
@jreback
Copy link
Contributor

jreback commented Jan 24, 2017

thanks @jeffcarey !

keep em coming!

@jreback jreback added the Bug label Jan 24, 2017
@jeffcarey jeffcarey deleted the fix/14125 branch January 26, 2017 01:19
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this pull request Mar 21, 2017
closes pandas-dev#14125

Previously, self->stream_cap was copied into a local variable called
maxstreamsize each time tokenize_bytes ran, and then this was checked
in the PUSH_CHAR macro. However, there is one other place in the file
where function make_stream_space() is called (in end_line()), and when
this happens self->stream_cap is increased but maxstreamsize is not
updated, making the check incorrect. In rare circumstances (see
original issue or test case) this could cause a crash. The resolution
is to just check self->stream_cap directly.

Author: Jeff Carey <[email protected]>

Closes pandas-dev#15195 from jeffcarey/fix/14125 and squashes the following commits:

d3c5f28 [Jeff Carey] BUG: Fixed incorrect stream size check (pandas-dev#14125)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: read_csv() crashes with engine='c'
3 participants